MERGE and parsing with prepared statements

  • Jump to comment-1
    pryzby@telsasoft.com2022-07-14T16:26:18+00:00
    We've used INSERT ON CONFLICT for a few years (with partitions as the target). That's also combined with prepared statements, for bulk loading. I was looking to see if we should use MERGE (probably not, but looking anyway). And came across this behavior. I'm not sure if it's any issue. CREATE TABLE CustomerAccount (CustomerId int, Balance float); PREPARE p AS MERGE INTO CustomerAccount CA USING (SELECT $1 AS CustomerId, $2 AS TransactionValue) AS T ON CA.CustomerId = T.CustomerId WHEN NOT MATCHED THEN INSERT (CustomerId, Balance) VALUES (T.CustomerId, T.TransactionValue) WHEN MATCHED THEN UPDATE SET Balance = Balance + TransactionValue; ERROR: operator does not exist: integer = text LINE 3: ON CA.CustomerId = T.CustomerId postgres: pryzbyj postgres [local] PREPARE(+0x2337be) [0x56108322e7be] postgres: pryzbyj postgres [local] PREPARE(oper+0x198) [0x56108322f1fb] postgres: pryzbyj postgres [local] PREPARE(make_op+0x7e) [0x56108322f55a] postgres: pryzbyj postgres [local] PREPARE(+0x228f2b) [0x561083223f2b] postgres: pryzbyj postgres [local] PREPARE(+0x227aa9) [0x561083222aa9] postgres: pryzbyj postgres [local] PREPARE(transformExpr+0x1c) [0x5610832227f9] postgres: pryzbyj postgres [local] PREPARE(transformMergeStmt+0x339) [0x56108322d988] postgres: pryzbyj postgres [local] PREPARE(transformStmt+0x70) [0x5610831f4071] postgres: pryzbyj postgres [local] PREPARE(+0x1fa350) [0x5610831f5350] postgres: pryzbyj postgres [local] PREPARE(transformTopLevelStmt+0x11) [0x5610831f5385] postgres: pryzbyj postgres [local] PREPARE(parse_analyze_varparams+0x5b) [0x5610831f54f4] postgres: pryzbyj postgres [local] PREPARE(pg_analyze_and_rewrite_varparams+0x38) [0x5610834bcdfe] postgres: pryzbyj postgres [local] PREPARE(PrepareQuery+0xcc) [0x561083292155] postgres: pryzbyj postgres [local] PREPARE(standard_ProcessUtility+0x4ea) [0x5610834c31a0] Why is $1 construed to be of type text ?
    • Jump to comment-1
      boekewurm+postgres@gmail.com2022-07-15T09:25:35+00:00
      On Thu, 14 Jul 2022, 18:26 Justin Pryzby, <pryzby@telsasoft.com> wrote: > > We've used INSERT ON CONFLICT for a few years (with partitions as the target). > That's also combined with prepared statements, for bulk loading. > > I was looking to see if we should use MERGE (probably not, but looking anyway). > And came across this behavior. I'm not sure if it's any issue. > > CREATE TABLE CustomerAccount (CustomerId int, Balance float); > > PREPARE p AS > MERGE INTO CustomerAccount CA > USING (SELECT $1 AS CustomerId, $2 AS TransactionValue) AS T > ON CA.CustomerId = T.CustomerId > WHEN NOT MATCHED THEN > INSERT (CustomerId, Balance) > VALUES (T.CustomerId, T.TransactionValue) > WHEN MATCHED THEN > UPDATE SET Balance = Balance + TransactionValue; > > ERROR: operator does not exist: integer = text > LINE 3: ON CA.CustomerId = T.CustomerId > > Why is $1 construed to be of type text ? The select statement that generates the row type of T `(select $1 CID, $2 TxV) AS T` does not put type bounds on the input parameters, so it remains `unknown` for the scope of that subselect. Once stored into the row type, the type of that column defaults to text, as a row type should not have 'unknown'-typed columns. You'll see the same issue with other subselects that select input parameters without casts, such as `select a from (select $1 a) A where A.a = 1;`. It's a pre-existing issue that has popped up earlier, and I think it's not something we've planned to fix in backbranches. Kind regards, Matthias van de Meent
      • Jump to comment-1
        pryzby@telsasoft.com2022-07-15T17:58:54+00:00
        On Fri, Jul 15, 2022 at 11:25:35AM +0200, Matthias van de Meent wrote: > On Thu, 14 Jul 2022, 18:26 Justin Pryzby, <pryzby@telsasoft.com> wrote: > > > > Why is $1 construed to be of type text ? > > The select statement that generates the row type of T `(select $1 CID, > $2 TxV) AS T` does not put type bounds on the input parameters, so it > remains `unknown` for the scope of that subselect. Once stored into > the row type, the type of that column defaults to text, as a row type > should not have 'unknown'-typed columns. Thanks for looking into it. I see now that the same thing can happen with "ON CONFLICT" if used with a subselect. PREPARE p AS INSERT INTO t SELECT a FROM (SELECT $1 AS a)a ON CONFLICT (i) DO UPDATE SET i=excluded.i; ERROR: column "i" is of type integer but expression is of type text It seems a bit odd that it's impossible to use merge with prepared statements without specifically casting the source types (which I did now to continue my experiment). -- Justin
        • Jump to comment-1
          alvherre@alvh.no-ip.org2022-07-15T18:40:19+00:00
          On 2022-Jul-15, Justin Pryzby wrote: > On Fri, Jul 15, 2022 at 11:25:35AM +0200, Matthias van de Meent wrote: > Thanks for looking into it. Definitely! Thanks, Matthias. > I see now that the same thing can happen with "ON CONFLICT" if used with a > subselect. > > PREPARE p AS INSERT INTO t SELECT a FROM (SELECT $1 AS a)a > ON CONFLICT (i) DO UPDATE SET i=excluded.i; > ERROR: column "i" is of type integer but expression is of type text Right, I didn't think that MERGE was doing anything peculiar in this respect. > It seems a bit odd that it's impossible to use merge with prepared statements > without specifically casting the source types (which I did now to continue my > experiment). I have no comments on this. Maybe it can be improved, but I don't know how. -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
          • Jump to comment-1
            david.g.johnston@gmail.com2022-07-15T19:17:51+00:00
            On Fri, Jul 15, 2022 at 11:40 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > On 2022-Jul-15, Justin Pryzby wrote: > > > It seems a bit odd that it's impossible to use merge with prepared > statements > > without specifically casting the source types (which I did now to > continue my > > experiment). > > I have no comments on this. Maybe it can be improved, but I don't know > how. > > Not tested, but the example prepare command fails to make use of the optional data types specification. Using that should remove the need to cast the parameter placeholder within the query itself. That said, in theory the INSERT specification of the MERGE could be used to either resolve unknowns or even forcibly convert the data types of the relation produced by the USING clause to match the actual types required for the INSERT (since that will happen at insert time anyway). This would make the UPDATE behavior slightly different than a top-level UPDATE command though, since that does not have the same context information. David J.
            • Jump to comment-1
              pryzby@telsasoft.com2022-07-15T19:40:14+00:00
              On Fri, Jul 15, 2022 at 12:17:51PM -0700, David G. Johnston wrote: > On Fri, Jul 15, 2022 at 11:40 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > On 2022-Jul-15, Justin Pryzby wrote: > > > > > It seems a bit odd that it's impossible to use merge with prepared statements > > > without specifically casting the source types (which I did now to continue my > > > experiment). > > > > I have no comments on this. Maybe it can be improved, but I don't know > > how. > > Not tested, but the example prepare command fails to make use of the > optional data types specification. Using that should remove the need to > cast the parameter placeholder within the query itself. What optional data type specification ? > That said, in theory the INSERT specification of the MERGE could be used to > either resolve unknowns or even forcibly convert the data types of the > relation produced by the USING clause to match the actual types required > for the INSERT (since that will happen at insert time anyway). Yeah. I hadn't looked before, but just noticed this: https://www.postgresql.org/docs/devel/sql-merge.html | If the expression for any column is not of the correct data type, automatic type conversion will be attempted. That appears to be copied from the INSERT page. What does that mean, if not that data types will be resolved as needed ? Note that if I add casts to the "ON" condition, MERGE complains about the INSERT VALUES. PREPARE p AS MERGE INTO CustomerAccount CA USING (SELECT $1 AS CustomerId, $2 AS TransactionValue) AS T ON CA.CustomerId = T.CustomerId::int WHEN NOT MATCHED THEN INSERT (CustomerId, Balance) VALUES (T.CustomerId, T.TransactionValue) WHEN MATCHED THEN UPDATE SET Balance = Balance + TransactionValue; ERROR: column "customerid" is of type integer but expression is of type text LINE 7: VALUES (T.CustomerId, T.TransactionValue) postgres: pryzbyj postgres [local] PREPARE(transformAssignedExpr+0x3b6) [0x5605f9699e8e] postgres: pryzbyj postgres [local] PREPARE(transformInsertRow+0x2ce) [0x5605f9653a47] postgres: pryzbyj postgres [local] PREPARE(transformMergeStmt+0x7ec) [0x5605f968fe3b] postgres: pryzbyj postgres [local] PREPARE(transformStmt+0x70) [0x5605f9656071] postgres: pryzbyj postgres [local] PREPARE(+0x1fa350) [0x5605f9657350] postgres: pryzbyj postgres [local] PREPARE(transformTopLevelStmt+0x11) [0x5605f9657385] postgres: pryzbyj postgres [local] PREPARE(parse_analyze_varparams+0x5b) [0x5605f96574f4] postgres: pryzbyj postgres [local] PREPARE(pg_analyze_and_rewrite_varparams+0x38) [0x5605f991edfe] postgres: pryzbyj postgres [local] PREPARE(PrepareQuery+0xcc) [0x5605f96f4155] postgres: pryzbyj postgres [local] PREPARE(standard_ProcessUtility+0x4ea) [0x5605f99251a0] postgres: pryzbyj postgres [local] PREPARE(ProcessUtility+0xdb) [0x5605f992587e] -- Justin
              • Jump to comment-1
                david.g.johnston@gmail.com2022-07-15T19:59:34+00:00
                On Fri, Jul 15, 2022 at 12:40 PM Justin Pryzby <pryzby@telsasoft.com> wrote: > > That appears to be copied from the INSERT page. > What does that mean, if not that data types will be resolved as needed ? > Yep, and the system needs to resolve the type at a point where there is no contextual information and so it chooses text. > Note that if I add casts to the "ON" condition, MERGE complains about the > INSERT VALUES. > > PREPARE p AS > MERGE INTO CustomerAccount CA > USING (SELECT $1 AS CustomerId, $2 AS TransactionValue) AS T > ON CA.CustomerId = T.CustomerId::int > WHEN NOT MATCHED THEN > INSERT (CustomerId, Balance) > VALUES (T.CustomerId, T.TransactionValue) > WHEN MATCHED THEN > UPDATE SET Balance = Balance + TransactionValue; > > ERROR: column "customerid" is of type integer but expression is of type > text > LINE 7: VALUES (T.CustomerId, T.TransactionValue) > > Noted. Not surprised. That error was always present, it's just that the join happens first. Since your fix narrowly targeted the join this error remained to be discovered. David J.
                • Jump to comment-1
                  pryzby@telsasoft.com2022-07-15T20:43:41+00:00
                  On Fri, Jul 15, 2022 at 12:59:34PM -0700, David G. Johnston wrote: > On Fri, Jul 15, 2022 at 12:40 PM Justin Pryzby <pryzby@telsasoft.com> wrote: > > > > | If the expression for any column is not of the correct data type, automatic type conversion will be attempted. > > That appears to be copied from the INSERT page. > > What does that mean, if not that data types will be resolved as needed ? > > Yep, and the system needs to resolve the type at a point where there is no > contextual information and so it chooses text. I don't know if that's a viable interpretation. The parser "resolved the type" by assuming a default, which caused an error before finishing parsing. In the case of INSERT, "conversion will be attempted" means it looks for a cast from the source type to the target type, and its "automatic type conversion" will fail if (for example) you try to insert a timestamp into an int. In the case of MERGE, that same sentence evidently means that it assumes a default source type (rather than looking for a cast) and then fails if it doesn't exactly match the target type. Should that sentence be removed from MERGE ? -- Justin
                  • Jump to comment-1
                    pryzby@telsasoft.com2022-07-18T13:20:00+00:00
                    On Fri, Jul 15, 2022 at 03:43:41PM -0500, Justin Pryzby wrote: > Should that sentence be removed from MERGE ? Also, I think these examples should be more similar. doc/src/sgml/ref/merge.sgml > <programlisting> > MERGE INTO CustomerAccount CA > USING RecentTransactions T > ON T.CustomerId = CA.CustomerId > WHEN MATCHED THEN > UPDATE SET Balance = Balance + TransactionValue > WHEN NOT MATCHED THEN > INSERT (CustomerId, Balance) > VALUES (T.CustomerId, T.TransactionValue); > </programlisting> > </para> > > <para> > Notice that this would be exactly equivalent to the following > statement because the <literal>MATCHED</literal> result does not change > during execution. > > <programlisting> > MERGE INTO CustomerAccount CA > USING (Select CustomerId, TransactionValue From RecentTransactions) AS T > ON CA.CustomerId = T.CustomerId > WHEN NOT MATCHED THEN > INSERT (CustomerId, Balance) > VALUES (T.CustomerId, T.TransactionValue) > WHEN MATCHED THEN > UPDATE SET Balance = Balance + TransactionValue; > </programlisting> > </para> The "ON" lines can be the same. The "MATCHED" can be in the same order. -- Justin
              • Jump to comment-1
                david.g.johnston@gmail.com2022-07-15T19:53:59+00:00
                On Fri, Jul 15, 2022 at 12:40 PM Justin Pryzby <pryzby@telsasoft.com> wrote: > On Fri, Jul 15, 2022 at 12:17:51PM -0700, David G. Johnston wrote: > > On Fri, Jul 15, 2022 at 11:40 AM Alvaro Herrera <alvherre@alvh.no-ip.org> > wrote: > > > On 2022-Jul-15, Justin Pryzby wrote: > > > > > > > It seems a bit odd that it's impossible to use merge with prepared > statements > > > > without specifically casting the source types (which I did now to > continue my > > > > experiment). > > > > > > I have no comments on this. Maybe it can be improved, but I don't know > > > how. > > > > Not tested, but the example prepare command fails to make use of the > > optional data types specification. Using that should remove the need to > > cast the parameter placeholder within the query itself. > > What optional data type specification ? > The one documented here: https://www.postgresql.org/docs/current/sql-prepare.html PREPARE name [ ( data_type [, ...] ) ] AS statement David J.
          • Jump to comment-1
            tgl@sss.pgh.pa.us2022-07-15T19:14:58+00:00
            Alvaro Herrera <alvherre@alvh.no-ip.org> writes: > On 2022-Jul-15, Justin Pryzby wrote: >> I see now that the same thing can happen with "ON CONFLICT" if used with a >> subselect. >> >> PREPARE p AS INSERT INTO t SELECT a FROM (SELECT $1 AS a)a >> ON CONFLICT (i) DO UPDATE SET i=excluded.i; >> ERROR: column "i" is of type integer but expression is of type text > Right, I didn't think that MERGE was doing anything peculiar in this > respect. Yeah. The current theory about this is that if we haven't assigned a type to an unknown-type parameter (or literal) that is an output column of a sub-SELECT, we will as a rule force it to text. That MERGE USING clause is a sub-SELECT, so that rule applies. There is a hoary old exception to that rule, which is that if you write INSERT INTO tab SELECT ..., $1, ... we will figure out the type of the column of "tab" that $1 is going into, and force $1 to that type instead of text. It looks like this also works in INSERT ... VALUES. You could make a case that MERGE should be equally smart, but it's not clear to me that the info is available sufficiently close by to make it reasonable to do that. It looks like the MERGE syntax has a couple of levels of indirection, which'd probably be enough to put the kibosh on that idea -- in particular, AFAICS there might not be a unique target column corresponding to a given data_source column. regards, tom lane